Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an atexit handler to bypass calls into ERR_ string routines. #35925

Merged
merged 3 commits into from
May 14, 2020

Conversation

bartonjs
Copy link
Member

@bartonjs bartonjs commented May 6, 2020

This works around Ubuntu's apparent lack of NO_ATEXIT support in their
build of OpenSSL.

Fixes #34231.

This works around Ubuntu's apparent lack of NO_ATEXIT support in their
build of OpenSSL.
@bartonjs bartonjs added area-System.Security os-linux Linux OS (any supported distro) labels May 6, 2020
@bartonjs bartonjs added this to the 5.0 milestone May 6, 2020
@bartonjs bartonjs requested a review from janvorli May 6, 2020 23:01
@ghost
Copy link

ghost commented May 6, 2020

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq
Notify danmosemsft if you want to be subscribed.


// Now, spin until existing calls all finish, and we can move on with shutdown.
int x;
while ((x = __atomic_load_n(&g_str_read_count, __ATOMIC_SEQ_CST)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this do something like sleep(1) to avoid monopolizing a core?


// Now, spin until existing calls all finish, and we can move on with shutdown.
int x;
while ((x = __atomic_load_n(&g_str_read_count, __ATOMIC_SEQ_CST)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a race between this method and the CryptoNative_ErrReasonErrorString (the other one too). The CryptoNative_ErrReasonErrorString can check the g_err_unloaded before we set it here and the __atomic_load_n can load the g_str_read_count before the CryptoNative_ErrReasonErrorString updates it, thus leaving us with the same problem that we are trying to fix. And even if there was not a race, it would not work reliably e.g. on ARM with its weak memory model, c++ volatile is just an indication to the compiler that it should perform each read / write. It doesn't add any memory barriers. So the thread calling the error getting methods doesn't have to see the g_err_unloaded change in a limited time.
It seems that we should use regular lock (pthread_mutex) around setting the g_err_unloaded here and in both ERR_reason_error_string and CryptoNative_ErrErrorStringN around the check of g_err_unloaded and calls to openssl. Also, the g_err_unloaded can then be just volatile, the locks will provide the necessary barriers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good points. I started with a rwlock, but that's not in (g)libc, so I had to take a direct dependency on libpthread. Since we haven't anywhere else I then backed that out and came up with this, but with the usual "cleverness" consequences of missing something.

So I'll just turn it back into a mutex. These routines are only hit on loading exception messages, so it shouldn't be too much of a penalty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libcoreclr.so already depends on libpthread, so it is not a big deal if we add dependency here too.

// Set this first, which stops new callers from using the error string tables.
g_err_unloaded = 1;

// Now, spin until existing calls all finish, and we can move on with shutdown.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible such calls could block or otherwise take an unbounded amount of time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, they are just a TryGetValue and a snprintf.

@bartonjs
Copy link
Member Author

bartonjs commented May 8, 2020

OK, I think this is ready for review again.

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@bartonjs bartonjs merged commit 359218e into dotnet:master May 14, 2020
@bartonjs bartonjs deleted the defensive_atexit branch May 14, 2020 15:29
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Security os-linux Linux OS (any supported distro)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exit code 139 on 'dotnet new'
4 participants